-
Notifications
You must be signed in to change notification settings - Fork 545
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add hamming, jensen-shannon, kl-divergence, correlation and russellrao distance metrics #4155
Add hamming, jensen-shannon, kl-divergence, correlation and russellrao distance metrics #4155
Conversation
…ivergence and russell-rao distance prims
…rking except some bug in kldivergence
@dantegd the cuml build looks to be broken due to this error below, is there a PR which is working on fixing this? https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cuml/job/prb/job/cuml-cpu-cuda-build/CUDA=11.2/1452/console |
@mdoijade |
I see. thanks! merged it in my RAFT PR now. |
@dantegd all the test runs are failing with some CI error - https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cuml/job/prb/job/cuml-cpu-python-build/CUDA=11.2,PYTHON=3.7/1239/console Also can we merge the raft PR now, as I believe it shouldn't break cuml builds as here all builds looks fine ? - rapidsai/raft#306 |
Can one of the admins verify this patch? |
@dantegd all the test runs are failing with some CI error, is it due to using custom raft fork ? 00:29:19 Cloning into '_external_repositories/raft'... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question and a reminder to change the raft branch back, otherwise this looks good.
cpp/cmake/thirdparty/get_raft.cmake
Outdated
GIT_REPOSITORY https://github.com/${PKG_FORK}/raft.git | ||
GIT_TAG ${PKG_PINNED_TAG} | ||
GIT_REPOSITORY https://github.com/mdoijade/raft.git | ||
GIT_TAG additionalDistPrims |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reminder to update this now that the raft side has been merged
|
||
// Call the distance function | ||
switch (metric) { | ||
case raft::distance::DistanceType::CorrelationExpanded: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the switch state propagated into this in case there is also an unexpanded version in the future? (Asking for the other distances as well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No there is no unexpanded version planned for this or other distance metrics. should I replace switch()
with an if
condition ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These functions are named after the associated distance measure so I would call a function named pairwise_distance_correlation expecting it to compute the correlation distance. If it's not possible that this will ever compute a distance other than correlation, why are we passing the DistanceType into these functions at all? Rather than using an if condition, I would propose removing the metric argument altogether and just having the function compute the distance for which its name implies. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! I think the metric arg got propagated to all distances due to euclidean dist having multiple sub-implementations.
I've removed this redundant check & arg now for all non-euclidean metrics which doesn't have multiple implementations.
@teju85 can I get a json.decoder.JSONDecodeError: Unterminated string starting at: line 2407838 column 18 (char 72712781) Traceback (most recent call last): |
rerun tests |
…contains single distance metric implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Majesh. The changes look great! Just pending CI now.
@dantegd before merging branch-21.10 recent changes in my PR all pytests related to new distance metrics were passing, @teju85 for visibility. |
Sorry for the late reply, I was on vacation. Thank you for working on this. I took a look at this PR and the RAFT changes. It's hard to tell at the moment which change might cause the issue. However, I could reproduce the issue locally. For this, you may need to either update your environment or to remove RAFT package from your conda env ( |
rerun tests |
Codecov Report
@@ Coverage Diff @@
## branch-21.10 #4155 +/- ##
===============================================
Coverage ? 85.98%
===============================================
Files ? 231
Lines ? 18515
Branches ? 0
===============================================
Hits ? 15920
Misses ? 2595
Partials ? 0
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
can this PR get merged now? |
@gpucibot merge |
…o distance metrics (rapidsai#4155) -- This PR depends on RAFT PR - rapidsai/raft#306 -- Adds cpp & python interfaces for these distance metrics with pytest support for each of them. -- also remove redundant commented code in canberra distance metric Authors: - Mahesh Doijade (https://github.com/mdoijade) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#4155
-- This PR depends on RAFT PR - rapidsai/raft#306
-- Adds cpp & python interfaces for these distance metrics with pytest support for each of them.
-- also remove redundant commented code in canberra distance metric